
Download a species list and cross-reference with conservation status lists in R

Knowing which species have been observed in a local area is a routine task in ecosystem management. Here we show how to build a species list with galah and cross-reference it with threatened and sensitive species lists. We then show how to visualise this information as a waffle chart using {waffle} and {ggplot2}.

Eukaryota
Animalia
Plantae
Summaries
R
Authors

Dax Kellie

Amanda Buyan

Published

May 2, 2024

Knowing what species inhabit an area is important for conservation and ecosystem management. In particular, it can help us find out how many known species are in a given area, whether each species is common or rare, and whether any species are threatened or endangered.

In this post, we will use the galah, waffle and ggplot2 packages to show you how to download a list of species within the Yass Valley in 2023, cross-reference this list with state-wide conservation status lists, and visualise the number of threatened and sensitive species in the region.

For those unfamiliar with Australian geography, Yass Valley is a local government area in southern New South Wales, just north of Canberra.

Download a list of species

There are two ways to narrow a download query to return information for a specific region:

  • Using fields available in galah (downloaded from the ALA)
  • Using a shapefile

The method you choose depends on whether the region you wish to return species for is already within galah, or whether you require a list for a more specific area defined by a shapefile.

First let’s load our packages.

library(dplyr)
library(ggplot2)
library(readr)
library(sf)
library(rmapshaper)
library(here)
library(galah)

To download species lists, you will also need to provide an email address registered with the ALA using galah_config().

galah_config(email = "your-email-here")
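As a rough sketch of the two approaches described above, the code below first queries by a field and then by a polygon. It rests on several assumptions: that the local government area layer is the field cl10923 (the ID mentioned in the footnotes), that the region is stored under the value "Yass Valley", and that a hand-drawn rectangle can stand in for a real boundary read from a shapefile. Check the field and its values with search_all() and show_values() before relying on them. Both queries need an internet connection and a registered email.

```r
library(galah)
library(sf)

galah_config(email = "your-email-here") # an email registered with the ALA

# Option 1: use a field already available in galah ------------------------
# Look for fields related to local government areas (LGAs),
# then list the values stored in the candidate field.
search_all(fields, "LGA")
search_all(fields, "cl10923") |> show_values()

# Assumption: the field ID and the value "Yass Valley" are correct;
# confirm the exact spelling in the output of show_values() above.
species_yass <- galah_call() |>
  galah_filter(cl10923 == "Yass Valley", year == 2023) |>
  atlas_species()

# Option 2: use a polygon -------------------------------------------------
# Assumption: this rectangle is only an illustrative stand-in. It is NOT the
# Yass Valley boundary. A real workflow would read a shapefile with
# sf::st_read(), possibly simplify it (e.g. with rmapshaper::ms_simplify())
# and make sure it uses WGS84 (EPSG:4326) before querying.
rough_bbox <- st_bbox(
  c(xmin = 148.6, ymin = -35.2, xmax = 149.4, ymax = -34.6),
  crs = st_crs(4326)
)
rough_area <- st_sf(geometry = st_as_sfc(rough_bbox))

species_area <- galah_call() |>
  galah_geolocate(rough_area) |>
  galah_filter(year == 2023) |>
  atlas_species()
```

Either result is a tibble with one row per species. The later steps in this post assume it is stored as species_yass and contains the columns species_name, class and kingdom.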

Cross-reference with threatened and sensitive species lists

Next we will compare our Yass Valley species list with several state-wide conservation status lists of threatened and sensitive species. We can retrieve lists of threatened and sensitive species in one of two ways:

  • Use the lists available in the Atlas of Living Australia
  • Use your own list

Both use the same method of matching species names in our Yass Valley list to species names in official conservation status lists. However, the workflow differs slightly between using galah and using an externally downloaded list.
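The name-matching step itself can be illustrated without any downloads. The sketch below uses invented placeholder species names and statuses (none of them are real records) to show the idea: tidy the names in the conservation list, then keep the rows of the regional list whose species_name also appears in it. All object names ending in _toy are hypothetical.

```r
library(dplyr)
library(tibble)

# Toy regional species list (placeholder names, not real data)
species_toy <- tibble(
  species_name = c("Genus alpha", "Genus beta", "Genus gamma", "Genus delta"),
  class = c("Aves", "Aves", "Mammalia", "Reptilia")
)

# Toy conservation list; the stray spaces mimic an uncleaned external file
threatened_toy <- tibble(
  species_name = c(" Genus beta", "Genus delta ", "Genus epsilon"),
  status = c("Vulnerable", "Endangered", "Vulnerable")
)

# Clean names before matching so that whitespace does not hide a match
threatened_clean_toy <- threatened_toy |>
  mutate(species_name = trimws(species_name))

# Keep only species in the regional list that appear in the conservation list
matched_toy <- species_toy |>
  filter(species_name %in% threatened_clean_toy$species_name)

matched_toy
```

In the real workflow the same filter would be applied to species_yass, once for the threatened list and once for the sensitive list, giving the yass_threatened and yass_sensitive objects used in the plotting code. Lists hosted by the ALA can be browsed with search_all(lists, ...).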

Visualise species conservation status

One useful way to visualise the number of threatened and sensitive species in an area is using a waffle chart. Waffle charts are useful because they can show the total number of species (represented as individual square units) and proportions of different groups (represented by colours).

For example, we can visualise the number and proportion of species in each conservation status category, along with a taxonomic breakdown of threatened and sensitive species.

library(waffle)
library(showtext)
library(glue)

# Add conservation status & taxa groups for plotting
species_yass_grouped <- species_yass |>
  mutate(
    conservation_status = case_when(
      species_name %in% yass_sensitive$species_name ~ "Sensitive",
      species_name %in% yass_threatened$species_name ~ "Threatened",
      .default = "No status"
    ),
    taxa_group = case_when(
      class == "Aves" ~ "Birds",
      class == "Reptilia" ~ "Reptiles",
      class == "Mammalia" ~ "Mammals",
      kingdom == "Plantae" ~ "Plants",
      .default = "Other"
    )
  )

# Count number of species by conservation status
status_table <- species_yass_grouped |>
  group_by(conservation_status) |>
  summarise(n = n()) |>
  mutate(proportion = n/sum(n)*100)

# Count number of species by taxonomic group
taxa_table <- species_yass_grouped |>
  filter(conservation_status %in% c("Sensitive", "Threatened")) |>
  group_by(taxa_group) |>
  summarise(n = n()) |>
  mutate(proportion = n/sum(n)*100)

# Extract percentage that are threatened/sensitive species
prop_threatened_or_sensitive <- status_table |>
  filter(conservation_status %in% c("Sensitive", "Threatened")) |>
  summarise(total = sum(proportion)) |>
  pull(total) |>
  round(2)

# Add nicer font
font_add_google("Roboto", "roboto")
showtext_auto()

# Plot 1 Waffle: Conservation Status
waffle_status <- 
  ggplot(status_table) +
  waffle::geom_waffle(aes(fill = conservation_status,
                          colour = conservation_status,
                          values = n),
                      n_rows = 17,
                      height = 0.75,
                      width = 0.75,
                      size = 1) +
  scale_colour_manual(name = "Conservation\nStatus",
                    values = c("#F3E6DC", "#D89A98", "#AB423F"),
                    labels = c("No status", "Sensitive", "Threatened")) +
  scale_fill_manual(name = "Conservation\nStatus",
                    values = c("#F3E6DC", "#D89A98", "#AB423F"),
                    labels = c("No status", "Sensitive", "Threatened")) +
  labs(title = glue::glue("{prop_threatened_or_sensitive}% of total species in \\
                          Yass Valley are threatened or sensitive"),
       caption = "1 square = 1 species") +
  coord_equal() + 
  theme_void() + 
  theme(legend.position = "bottom",
        text = element_text(family = "roboto", lineheight = 0.5),
        legend.title = element_text(hjust = 0.5, size = 20),
        legend.text = element_text(size = 19),
        plot.title = element_text(hjust = 0.5, size = 25),
        plot.caption = element_text(size = 17),
        plot.margin = margin(0.5, 1, 0.5, 1, unit = "cm"))

# Plot 2: Taxonomic breakdown
waffle_taxa <- 
  ggplot(taxa_table) +
  waffle::geom_waffle(aes(fill = taxa_group,
                          colour = taxa_group,
                          values = n),
                      n_rows = 4,
                      height = 0.75,
                      width = 0.75,
                      size = 1) +
  scale_colour_manual(name = "Group",
                    values = c("#567C7C", "#6D714A", "#465743", "#22352C", "#C4AC79"),
                    labels = c("Birds", "Mammals", "Other", "Plants", "Reptiles")) +
  scale_fill_manual(name = "Group",
                    values = c("#567C7C", "#6D714A", "#465743", "#22352C", "#C4AC79"),
                    labels = c("Birds", "Mammals", "Other", "Plants", "Reptiles")) +
  labs(title = "Taxonomic breakdown of threatened & sensitive species",
       caption = "1 square = 1 species") +
  coord_equal() + 
  theme_void() + 
  theme(legend.position = "bottom",
        text = element_text(family = "roboto"),
        legend.title = element_text(hjust = 0.5, size = 20),
        legend.text = element_text(size = 19),
        plot.title = element_text(hjust = 0.5, size = 25),
        plot.caption = element_text(size = 17, hjust = 1),
        plot.margin = margin(0.5, 2.5, 0.5, 2.5, unit = "cm"))

# Display both plots
waffle_status
waffle_taxa

Final thoughts

We hope this post has helped you understand how to download a species list for a specific area and compare it to conservation lists. It’s also possible to compare species with other information like lists of migratory species or seasonal species.

For other posts, check out our beginner’s guide to map species observations or see an investigation of dingo observations in the ALA.

Expand for session info

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.1 (2024-06-14 ucrt)
 os       Windows 10 x64 (build 19045)
 system   x86_64, mingw32
 ui       RTerm
 language (EN)
 collate  English_Australia.utf8
 ctype    English_Australia.utf8
 tz       Australia/Sydney
 date     2024-08-20
 pandoc   3.1.11 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version date (UTC) lib source
 dplyr       * 1.1.4   2023-11-17 [1] CRAN (R 4.3.2)
 galah       * 2.0.2   2024-04-12 [1] CRAN (R 4.4.1)
 ggplot2     * 3.5.1   2024-04-23 [1] CRAN (R 4.4.0)
 glue        * 1.6.2   2022-02-24 [1] CRAN (R 4.3.2)
 here        * 1.0.1   2020-12-13 [1] CRAN (R 4.3.2)
 htmltools   * 0.5.7   2023-11-03 [1] CRAN (R 4.3.2)
 ozmaps      * 0.4.5   2021-08-03 [1] CRAN (R 4.3.2)
 readr       * 2.1.5   2024-01-10 [1] CRAN (R 4.3.3)
 rmapshaper  * 0.5.0   2023-04-11 [1] CRAN (R 4.3.2)
 sessioninfo * 1.2.2   2021-12-06 [1] CRAN (R 4.3.2)
 sf          * 1.0-16  2024-03-24 [1] CRAN (R 4.3.3)
 showtext    * 0.9-6   2023-05-03 [1] CRAN (R 4.3.2)
 showtextdb  * 3.0     2020-06-04 [1] CRAN (R 4.3.2)
 sysfonts    * 0.8.8   2022-03-13 [1] CRAN (R 4.3.2)
 waffle      * 1.0.2   2024-05-03 [1] Github (hrbrmstr/waffle@767875b)

 [1] C:/Users/KEL329/R-packages
 [2] C:/Users/KEL329/AppData/Local/Programs/R/R-4.4.1/library

──────────────────────────────────────────────────────────────────────────────

Footnotes

  1. Each spatial layer has a two-letter code, along with a number to identify it. The abbreviations are as follows:
    * cl = contextual layer (i.e. boundaries of LGAs, Indigenous Protected Areas, States/Territories etc.)
    * 10923 = number associated with the spatial layer in the atlas↩︎

  2. Simplifying a shapefile reduces the total number of points used to draw the shape outline.↩︎

  3. Check out this post for a better explanation of what CRS is and how it affects maps.↩︎

  4. These are the same two lists that you can access in galah, available from the Atlas of Living Australia. Keep in mind that if you use an external list, data cleaning may be required before matching species names.↩︎